Using Threads
Learn to use threads.
What are threads?
Threads in Python are a good way to run a function concurrently with other functions. If your system does not support multiple processors, the threads will be executed one after another as scheduled by the operating system. However, if multiple CPUs are available, threads could be scheduled on multiple processing units, once again as determined by the operating system.
By default, there is only one thread, the main thread, and it is the thread that runs your Python application. To start another thread, Python provides the threading module.
To run the following code snippet, press the Run button and enter the command python2 threading-start.py.
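The listing for threading-start.py does not appear in this text. As a stand-in, here is a minimal sketch of starting and joining a second thread with the threading module. The print_something function and its "hello" argument are illustrative assumptions and not necessarily what the original script contains.

```python
import threading


def print_something(something):
    # Hypothetical worker function: runs in the second thread.
    print(something)


# Create a second thread that will call print_something("hello").
t = threading.Thread(target=print_something, args=("hello",))
t.start()

# The main thread keeps running while the second thread does its work,
# so the relative order of the two prints is not guaranteed.
print("thread started")

# Wait for the second thread to complete before the program ends.
t.join()
```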
If you run the above example multiple times, you will notice that the output might be different each time. On my laptop, doing this gives the following:
If you expected one particular output every time, you forgot that there is no guarantee regarding the order in which the threads execute.
Once the second thread is started, the main thread waits for it to complete by calling its join method. Using join is a handy way to make sure no threads are left behind.
If you do not join all your threads and wait for them to finish, the main thread may finish before the other threads do. If this happens, your program will appear to be blocked while the interpreter waits for the remaining threads, and it will not respond to even a simple KeyboardInterrupt signal.
Note: We have demonstrated this effect using Python 2. However, in Python 3, the threading effect is not easily observable using small-scale examples.
Threads as daemons
To avoid a program that hangs waiting on its threads, and because your program might not be in a position to wait for them, you can configure threads as daemons. When a thread is a daemon, Python considers it a background thread and terminates it as soon as the main thread exits.
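A minimal sketch of a daemon thread might look like the following. The background_task function and its endless loop are hypothetical, chosen only to show a thread that would never finish on its own.

```python
import threading
import time


def background_task():
    # Hypothetical background job that never returns by itself.
    while True:
        time.sleep(0.1)


t = threading.Thread(target=background_task)
# The daemon flag must be set before the thread is started.
t.daemon = True
t.start()

# No join() here: when the main thread exits, the daemon thread is
# terminated along with it instead of keeping the program alive.
print("daemon thread alive: %s" % t.is_alive())
```

Without the daemon flag, this program would never exit, because the interpreter would wait for background_task to return.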
In the above example, there is no longer a need to use the join method since the thread is set to be a daemon.
Running multiple threads
The program below is a simple example, which sums one million random integers eight times, spread across eight threads at the same time.
To run the following code snippet, press the Run button and wait until all the commands finish running. Then, enter the command time python multithreading-worker.py.
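The listing for multithreading-worker.py is not included in this text. The sketch below shows one way such a workload could be written. The names compute and run_workers, and the choice of integers between 1 and 100, are assumptions made for illustration.

```python
import random
import threading


def compute(results, n_integers):
    # Sum n_integers random integers and store the total.
    # list.append is safe to call from several threads in CPython.
    results.append(sum(random.randint(1, 100) for _ in range(n_integers)))


def run_workers(n_threads=8, n_integers=1000000):
    # Hypothetical helper: spread the same computation over n_threads threads.
    results = []
    workers = [
        threading.Thread(target=compute, args=(results, n_integers))
        for _ in range(n_threads)
    ]
    for worker in workers:
        worker.start()
    # Join every worker so that no thread is left behind.
    for worker in workers:
        worker.join()
    return results
```

Calling run_workers() with its defaults corresponds to the workload described above: eight threads, each summing one million random integers. Passing a smaller n_integers is a quick way to check that the code works before timing the full run.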
Running the above example with the time command returns the result along with statistics about execution time and CPU usage.
The program ran on an idle dual-core CPU, which means that Python could have used up to 200% CPU power. However, it was unable to do so: even with eight threads running, it stayed just above 100%, which is just above 50% of the hardware’s capabilities.
The bottleneck
The following figure illustrates that bottleneck: to access all of the system’s CPUs, every thread has to go through CPython’s GIL.
Again, as discussed in the previous chapter, the GIL limits the performance of CPython when executing multiple threads. Threads are therefore mostly useful for running background work or doing input/output on slow networks or files rather than for CPU-bound computation. Blocking I/O releases the GIL while it waits, so those tasks can overlap without blocking the main thread.
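To illustrate the I/O case, here is a small sketch in which time.sleep stands in for a slow network or file operation. The fake_download and timed_run names are hypothetical, and the delay values are arbitrary.

```python
import threading
import time


def fake_download(delay):
    # time.sleep stands in for a blocking network or file read.
    # Like real blocking I/O, it releases the GIL while waiting.
    time.sleep(delay)


def timed_run(n_threads=4, delay=0.2):
    # Start n_threads simulated downloads and measure the total wall time.
    start = time.time()
    threads = [
        threading.Thread(target=fake_download, args=(delay,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


elapsed = timed_run()
print("4 waits of 0.2s took %.2fs in total" % elapsed)
```

Because the waits overlap, the total time should stay close to a single delay rather than the sum of all four, which is the opposite of what happens with the CPU-bound workers.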